14 research outputs found

    Hierarchical Multi-resolution Mesh Networks for Brain Decoding

    Full text link
    We propose a new framework, called Hierarchical Multi-resolution Mesh Networks (HMMNs), which establishes a set of brain networks at multiple time resolutions of the fMRI signal to represent the underlying cognitive process. The suggested framework first decomposes the fMRI signal into frequency subbands using wavelet transforms. Then, a brain network, called a mesh network, is formed at each subband by ensembling a set of local meshes. The locality around each anatomic region is defined with respect to a neighborhood system based on functional connectivity. The arc weights of a mesh are estimated by ridge regression among the average region time series. In the final step, the adjacency matrices of the mesh networks obtained at different subbands are ensembled for brain decoding under a hierarchical learning architecture called fuzzy stacked generalization (FSG). Our results on the Human Connectome Project task-fMRI dataset show that the suggested HMMN model can successfully discriminate tasks by extracting complementary information from the mesh arc weights of multiple subbands. We study the topological properties of the mesh networks at different resolutions using network measures, namely node degree, node strength, betweenness centrality, and global efficiency, and investigate the connectivity of anatomic regions during a cognitive task. We observe significant variations among the network topologies obtained for different subbands. We also analyze the diversity properties of the classifier ensemble trained on the mesh networks in multiple subbands and observe that the classifiers in the ensemble collaborate to fuse the complementary information extracted at each subband. We conclude that fMRI data recorded during a cognitive task embed diverse information across the anatomic regions at each resolution.
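
    The core computation is easy to see in miniature: decompose the region time series into wavelet subbands, then, for each region, regress its signal on its neighbors with ridge regression to obtain local mesh arc weights. The sketch below assumes toy data, a db4 wavelet, fully connected neighborhoods, and the pywt/scikit-learn libraries; none of these choices are prescribed by the paper.

    # Sketch of the HMMN building blocks: wavelet subband decomposition
    # of region time series, then local mesh estimation with ridge
    # regression. Shapes and neighborhoods are illustrative assumptions.
    import numpy as np
    import pywt
    from sklearn.linear_model import Ridge

    def subband_signals(ts, wavelet="db4", level=3):
        """Decompose a (regions x time) matrix into wavelet subbands."""
        return pywt.wavedec(ts, wavelet, level=level, axis=1)

    def local_mesh_weights(band, region, neighbors, alpha=1.0):
        """Arc weights of one local mesh: regress the target region's
        subband signal on the signals of its neighbors."""
        X = band[neighbors].T            # (time, |neighbors|)
        y = band[region]                 # (time,)
        return Ridge(alpha=alpha).fit(X, y).coef_

    # Toy example: 10 regions, 256 time points, fully connected meshes.
    rng = np.random.default_rng(0)
    ts = rng.standard_normal((10, 256))
    for band in subband_signals(ts):
        A = np.zeros((10, 10))           # adjacency matrix of one mesh network
        for r in range(10):
            nb = [i for i in range(10) if i != r]
            A[r, nb] = local_mesh_weights(band, r, nb)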

    To Invest or Not to Invest: Using Vocal Behavior to Predict Decisions of Investors in an Entrepreneurial Context

    Get PDF
    Entrepreneurial pitch competitions have become increasingly popular in the start-up culture to attract prospective investors. As the ultimate funding decision often follows from some form of social interaction, it is important to understand how the decision-making process of investors is influenced by behavioral cues. In this work, we examine whether vocal features are associated with the ultimate funding decision of investors by utilizing deep learning methods. We used videos of individuals in an entrepreneurial pitch competition as input to predict whether investors will invest in the startup or not. We propose models that combine deep audio features and Handcrafted audio Features (HaF) and feed them into two types of Recurrent Neural Networks (RNN), namely Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU). We also trained the RNNs with only deep features to assess whether HaF provide additional information to the models. Our results show that it is promising to use the vocal behavior of pitchers to predict whether investors will invest in their business idea. Different types of RNNs yielded similar performance, yet the addition of HaF improved the performance.
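
    As a rough illustration of the model family described above, the sketch below feeds a concatenation of deep audio features and HaF into a GRU with a sigmoid head for the binary invest/not-invest decision. All dimensions, the single-layer GRU, and the fusion-by-concatenation step are illustrative assumptions, not the authors' exact architecture.

    # Hypothetical PitchRNN: deep audio features + HaF -> GRU -> invest probability.
    import torch
    import torch.nn as nn

    class PitchRNN(nn.Module):
        def __init__(self, deep_dim=128, haf_dim=40, hidden=64):
            super().__init__()
            self.rnn = nn.GRU(deep_dim + haf_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)      # binary: invest or not

        def forward(self, deep_feats, haf):
            x = torch.cat([deep_feats, haf], dim=-1)   # (batch, time, feat)
            _, h = self.rnn(x)                         # last hidden state
            return torch.sigmoid(self.head(h[-1]))

    # Toy forward pass: batch of 4 pitches, 200 audio frames each.
    model = PitchRNN()
    deep = torch.randn(4, 200, 128)
    haf = torch.randn(4, 200, 40)
    p_invest = model(deep, haf)                        # (4, 1) probabilities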

    Elucidating the Exposure Bias in Diffusion Models

    Full text link
    Diffusion models have demonstrated impressive generative capabilities, but their 'exposure bias' problem, described as the input mismatch between training and sampling, lacks in-depth exploration. In this paper, we systematically investigate the exposure bias problem in diffusion models by first analytically modelling the sampling distribution, based on which we then identify the prediction error at each sampling step as the root cause of the exposure bias issue. Furthermore, we discuss potential solutions to this issue and propose an intuitive metric for it. Along with the elucidation of exposure bias, we propose a simple yet effective, training-free method called Epsilon Scaling to alleviate the exposure bias. We show that Epsilon Scaling explicitly moves the sampling trajectory closer to the vector field learned in the training phase by scaling down the network output (Epsilon), mitigating the input mismatch between training and sampling. Experiments on various diffusion frameworks (ADM, DDPM/DDIM, EDM, LDM), unconditional and conditional settings, and deterministic vs. stochastic sampling verify the effectiveness of our method. For example, our ADM-ES, as a state-of-the-art stochastic sampler, obtains 2.17 FID on the CIFAR-10 dataset under 100-step unconditional generation. The code is available at https://github.com/forever208/ADM-ES and https://github.com/forever208/EDM-ES.
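
    A minimal sketch of where Epsilon Scaling would sit in a DDPM-style ancestral sampling step: the predicted noise is divided by a scaling factor before the posterior mean is computed. The schedule handling, the fixed sqrt(beta_t) noise level, and the value of lam are illustrative assumptions; the authors' exact implementation is in the linked repositories.

    # One DDPM ancestral step with Epsilon Scaling applied to the
    # network output. alphas, alphas_bar, betas are 1-D schedule tensors.
    import torch

    def ddpm_step_eps_scaling(model, x_t, t, alphas, alphas_bar, betas,
                              lam=1.005):
        eps = model(x_t, t) / lam                   # Epsilon Scaling
        a_t, ab_t, b_t = alphas[t], alphas_bar[t], betas[t]
        mean = (x_t - b_t / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
        noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
        return mean + torch.sqrt(b_t) * noise       # x_{t-1}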

    What will Your Future Child Look Like? Modeling and Synthesis of Hereditary Patterns of Facial Dynamics

    No full text
    Analysis of kinship from facial images or videos is an important problem. Prior machine learning and computer vision studies approach kinship analysis as a verification or recognition task. In this paper, for the first time in the literature, we propose a kinship synthesis framework, which generates smile videos of (probable) children from the smile videos of parents. While the appearance of a child's smile is learned using a convolutional encoder-decoder network, another neural network models the dynamics of the corresponding smile. The smile video of the estimated child is synthesized by the combined use of the appearance and dynamics models. In order to validate our results, we perform kinship verification experiments using videos of real parents and of estimated children generated by our framework. The results show that generated videos of children achieve higher correct verification rates than those of real children. Our results also indicate that using generated videos together with real ones in the training of kinship verification models increases the accuracy, suggesting that such videos can be used as a synthetic dataset.
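
    The two-model decomposition can be sketched as follows: a convolutional encoder-decoder maps parent appearance to child appearance, while a separate recurrent model handles smile dynamics, here assumed to act on facial landmarks. Every layer size and the landmark representation are hypothetical; the sketch is only meant to make the appearance/dynamics split concrete.

    # Hypothetical appearance and dynamics models for smile synthesis.
    import torch
    import torch.nn as nn

    appearance = nn.Sequential(                # parent frame -> child frame
        nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
    )
    dynamics = nn.LSTM(input_size=68 * 2, hidden_size=128,
                       batch_first=True)       # e.g. 68 (x, y) landmarks

    parent_frame = torch.rand(1, 3, 64, 64)        # one parent smile frame
    child_frame = appearance(parent_frame)         # synthesized appearance
    landmarks = torch.rand(1, 30, 68 * 2)          # 30-step landmark sequence
    traj, _ = dynamics(landmarks)                  # modeled smile dynamics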

    Does the Strength of Sentiment Matter? A Regression Based Approach on Turkish Social Media

    No full text
    Social media posts are usually informal and short, and they may not always express their sentiment clearly. Therefore, multiple raters may assign different sentiments to a tweet. Instead of employing majority voting, which ignores the strength of sentiments, the annotation can be enriched with a confidence score assigned to each sentiment. In this study, we analyze the effect of using regression on confidence scores in sentiment analysis of Turkish tweets. We extract hand-crafted features including lexical features, emoticons, and sentiment scores. We also employ word embeddings of tweets for regression and classification. Our findings reveal that employing regression on confidence scores slightly improves sentiment classification accuracy. Moreover, combining word embeddings with hand-crafted features reduces the feature dimensionality and outperforms alternative feature combinations.
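
    A minimal sketch of the regression setup: concatenate averaged word embeddings with hand-crafted features, fit a regressor to signed confidence scores, and threshold the predicted score to recover a sentiment label. The ridge regressor, the feature dimensions, and the random data are assumptions for illustration.

    # Toy confidence-score regression over embedding + hand-crafted features.
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    emb = rng.standard_normal((500, 100))    # averaged word embeddings
    hand = rng.standard_normal((500, 12))    # lexicon/emoticon features
    X = np.hstack([emb, hand])
    y = rng.uniform(-1, 1, 500)              # signed sentiment confidence

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    reg = Ridge(alpha=1.0).fit(X_tr, y_tr)
    labels = np.sign(reg.predict(X_te))      # threshold scores to polarity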

    Gender classification using mesh networks on multiresolution multitask fMRI data

    No full text
    Brain connectivity networks have been shown to represent gender differences under a number of cognitive tasks. Recently, it has been conjectured that fMRI signals decomposed into different resolutions embed different types of cognitive information. In this paper, we combine multiresolution analysis and connectivity networks to study gender differences under a variety of cognitive tasks, and propose a machine learning framework to discriminate individuals according to their gender. For this purpose, we estimate a set of brain networks, formed at different resolutions while the subjects perform different cognitive tasks. First, we decompose fMRI signals recorded under a sequence of cognitive stimuli into frequency subbands using the Discrete Wavelet Transform (DWT). Next, we represent the fMRI signals by mesh networks formed among the anatomic regions for each task experiment at each subband. The mesh networks are constructed by ensembling a set of local meshes, each of which represents the relationship of an anatomical region as a weighted linear combination of its neighbors. Then, we estimate the edge weights of each mesh by ridge regression. The proposed approach yields 2 × C × L functional mesh networks for each subject, where C is the number of cognitive tasks and L is the number of subband signals obtained after wavelet decomposition. This approach enables one to classify gender under different cognitive tasks and different frequency subbands. The final step of the suggested framework is to fuse the complementary information of the mesh networks for each subject to discriminate the gender. We fuse the information embedded in mesh networks formed for different tasks and resolutions under a three-level fuzzy stacked generalization (FSG) architecture, in which different layers are responsible for fusing the diverse information obtained from different cognitive tasks and resolutions. In the experimental analyses, we use the Human Connectome Project task fMRI dataset. Results show that fusing the mesh network representations computed at multiple resolutions for multiple tasks provides the best gender classification accuracy compared to single-subband task mesh networks or to fusing representations obtained using only multitask or only multiresolution data. Moreover, mesh edge weights slightly outperform pairwise correlations between regions and significantly outperform raw fMRI signals. In addition, we analyze the gender-discriminative power of mesh edge weights for different tasks and resolutions.
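
    The fusion step can be illustrated with plain two-level stacking: one base classifier per (task, subband) mesh network produces class posteriors, and a meta-classifier is trained on the concatenated posteriors. This is a simplification of the paper's three-level fuzzy stacked generalization, with toy data and logistic regression standing in for the actual learners.

    # Two-level stacking as a stand-in for FSG fusion of mesh networks.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    C, L, n, d = 3, 4, 200, 50              # tasks, subbands, subjects, features
    y = rng.integers(0, 2, n)               # gender labels

    posteriors = []
    for _ in range(C * L):                  # one base learner per mesh network
        X = rng.standard_normal((n, d))     # mesh edge weights (toy data)
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        # In practice these posteriors would come from held-out CV folds.
        posteriors.append(clf.predict_proba(X))

    meta_X = np.hstack(posteriors)          # fused posterior feature space
    meta = LogisticRegression(max_iter=1000).fit(meta_X, y)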

    Encoding the local connectivity patterns of fMRI for cognitive task and state classification

    No full text
    In this work, we propose a novel framework to encode the local connectivity patterns of the brain using Fisher vectors (FV), vectors of locally aggregated descriptors (VLAD), and bag-of-words (BoW) methods. We first obtain local descriptors, called mesh arc descriptors (MADs), from fMRI data by forming local meshes around anatomical regions and estimating their relationships within a neighborhood. Then, we extract a dictionary of relationships, called a brain connectivity dictionary, by fitting a generative Gaussian mixture model (GMM) to a set of MADs and selecting codewords at the mean of each component of the mixture. Codewords represent connectivity patterns among anatomical regions. We also encode MADs by VLAD and BoW methods using k-means clustering. We classify cognitive tasks using the Human Connectome Project (HCP) task fMRI dataset and cognitive states using the Emotional Memory Retrieval (EMR) dataset. We train support vector machines (SVMs) using the encoded MADs. Results demonstrate that FV encoding of MADs can be successfully employed for the classification of cognitive tasks and outperforms the VLAD and BoW representations. Moreover, we identify the significant Gaussians in the mixture models by computing the energy of their corresponding FV parts and analyze their effect on classification accuracy. Finally, we suggest a new method to visualize the codewords of the learned brain connectivity dictionary.
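
    The dictionary idea reduces to a few lines: fit a GMM to a pool of MADs, read the codewords off the component means, and encode a subject's descriptors against them. The sketch below uses a hard-assignment VLAD encoding for brevity; the paper's FV encoding additionally uses the GMM posteriors and covariances, and all shapes here are toy assumptions.

    # GMM connectivity dictionary plus a simple VLAD encoding of MADs.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    mads = rng.standard_normal((1000, 20))   # local descriptors, one per mesh
    gmm = GaussianMixture(n_components=8, random_state=0).fit(mads)
    codewords = gmm.means_                   # brain connectivity dictionary

    def vlad(descs, codewords):
        assign = gmm.predict(descs)          # hard codeword assignments
        enc = np.zeros_like(codewords)
        for k in range(len(codewords)):
            if np.any(assign == k):
                enc[k] = (descs[assign == k] - codewords[k]).sum(axis=0)
        return enc.ravel() / (np.linalg.norm(enc) + 1e-12)

    encoding = vlad(mads[:50], codewords)    # one subject's encoded MADs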

    Predicting Vasovagal Reactions to Needles from Facial Action Units

    Get PDF
    Background: Merely the sight of needles can cause extreme emotional and physical (vasovagal) reactions (VVRs). However, needle fear and VVRs are not easy to measure or prevent, as they are automatic and difficult to self-report. This study aims to investigate whether blood donors’ unconscious facial microexpressions in the waiting room, prior to actual blood donation, can be used to predict who will experience a VVR later, during the donation. Methods: The presence and intensity of 17 facial action units were extracted from video recordings of 227 blood donors and were used to classify low and high VVR levels using machine-learning algorithms. We included three groups of blood donors as follows: (1) a control group, who had never experienced a VVR in the past (n = 81); (2) a ‘sensitive’ group, who experienced a VVR at their last donation (n = 51); and (3) new donors, who are at increased risk of experiencing a VVR (n = 95). Results: The model performed very well, with an F1 score (the weighted average of precision and recall) of 0.82. The most predictive feature was the intensity of facial action units in the eye regions. Conclusions: To our knowledge, this study is the first to demonstrate that it is possible to predict who will experience a vasovagal response during blood donation through facial microexpression analyses prior to donation.
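
    The prediction setup is conventional enough to sketch directly: 17 action-unit intensity features per donor, a standard classifier, and the weighted F1 score quoted above. The random forest, the train/test split, and the synthetic data are illustrative assumptions, not the study's exact pipeline.

    # Toy VVR classification from facial action unit intensities.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 5, (227, 17))         # AU intensities per donor
    y = rng.integers(0, 2, 227)              # low vs. high VVR level

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y,
                                              random_state=0)
    clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    print(f1_score(y_te, clf.predict(X_te), average="weighted"))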